Genetic reanalysis of patients with a difference of sex development carrying the NR5A1/SF-1 variant p.Gly146Ala has discovered other likely disease-causing variations

NR5A1/SF-1 (Steroidogenic factor-1) variants may cause mild to severe differences of sex development (DSD) or may be found in healthy carriers. The NR5A1/SF-1 c.437G>C/p.Gly146Ala variant is common in individuals with a DSD and has been suggested to act as a susceptibility factor for adrenal disease or cryptorchidism. Since the allele frequency is high in the general population, and the functional testing of the p.Gly146Ala variant revealed inconclusive results, the disease-causing effect of this variant has been questioned. However, a role as a disease modifier is still possible given that oligogenic inheritance has been described in patients with NR5A1/SF-1 variants. Therefore, we performed next generation sequencing (NGS) in 13 DSD individuals harboring the NR5A1/SF-1 p.Gly146Ala variant to search for other DSD-causing variants and clarify the function of this variant for the phenotype of the carriers. Panel and whole-exome sequencing was performed, and data were analyzed with a filtering algorithm for detecting variants in NR5A1- and DSD-related genes. The phenotype of the studied individuals ranged from scrotal hypospadias and ambiguous genitalia in 46,XY DSD to opposite sex in both 46,XY and 46,XX. In nine subjects we identified either a clearly pathogenic DSD gene variant (e.g. in AR) or one to four potentially deleterious variants that likely explain the observed phenotype alone (e.g. in FGFR3, CHD7). Our study shows that most individuals carrying the NR5A1/SF-1 p.Gly146Ala variant, harbor at least one other deleterious gene variant which can explain the DSD phenotype. This finding confirms that the NR5A1/SF-1 p.Gly146Ala variant may not contribute to the pathogenesis of DSD and qualifies as a benign polymorphism. Thus, individuals, in whom the NR5A1/SF-1 p.Gly146Ala gene variant has been identified as the underlying genetic cause for their DSD in the past, should be re-evaluated with a NGS method to reveal the real genetic diagnosis.


Introduction
Typical sex development depends on a tightly controlled network of genes and pathways [1][2][3]. Any perturbation in this very complex biological event, which involves genetic and hormonal processes, may result in atypical sex development and leads to a broad range of differences of sex development (DSD) in humans, also known as variations of sex characteristics (VSC) [4]. The NR5A1 gene, which encodes the steroidogenic factor 1 (SF-1), regulates multiple genes implicated in adrenal development, gonadal determination and differentiation, steroidogenesis, and reproduction [5]. Since the identification of the first NR5A1/SF-1 variation in a 46,XY patient with primary adrenal failure and complete gonadal dysgenesis [6], the gonadal and reproductive phenotype associated with NR5A1/SF-1 variants has broadened including 46,XY and 46,XX cases with sex reversal to minor anomalies such as hypospadias or even healthy carriers, whereas adrenal failure has only been found in very rare cases [7,8].
Currently, 291 different NR5A1/SF-1 variants are reported in the Human Gene Mutation Database (HGMD, October 2022). Variants include missense/nonsense, indels, small insertions/deletions, complete gene deletions or splice-site variants that are scattered throughout the whole gene without obvious hot spots. According to ACMG (American College of Medical Genetics and Genomics) classification [9], described variants may qualify as (likely) pathogenic or (likely) benign, and several are variants of unknown significance (VUS).
The nonsynonymous NR5A1/SF-1 c.437G>C/p.Gly146Ala (rs1110061) variant has been first described in a group of Japanese patients presenting with a series of adrenal diseases such as Cushing's syndrome or non-functioning adrenocortical adenoma [10]. In this context, WuQiang et al. reported that the p.Gly146Ala variant slightly impairs the transactivation ability of adrenal and ovary specific gene promoters but does not affect cofactor interaction and cellular localization [10]. Later, it has been proposed to act as a susceptibility factor for cryptorchidism [11], isolated micropenis [12,13], spermatogenic failure [14,15], primary ovarian insufficiency (POI) [16] and type 2 diabetes [17]. The p.Gly146Ala variant is common among individuals with a 46,XY DSD with a prevalence varying between 6.8 and 31% [18,19]. However, the minor allele frequency (MAF) (C allele) is also high in the overall control population (23.5%, gnomAD v3.1.2). Specifically, its MAF is increased approximately by 1.3-3-fold in the East Asian and the African control populations, whereas it is only present in 1% of the European control population (gnomAD v3.1.2). Moreover, in vitro studies of transcriptional activity of the NR5A1/SF-1 p.Gly146Ala variant on several target promoters in various cell models found unaltered wild-type functionality [18,20]. In fact, some patients who carry severe, pathogenic NR5A1/SF-1 variants in compound heterozygous state with the p.Gly146Ala variant, do not phenotypically differ from individuals carrying the severe variant only [8,[19][20][21][22][23][24][25]. The DSD causing effect of the NR5A1/SF-1 p.Gly146Ala variant is therefore in doubt. However, given that oligogenic inheritance has been suggested for explaining the broad phenotype observed in individuals and families with NR5A1/SF-1 gene variants [26-32], the p.Gly146Ala variant might play a role as co-regulator or disease modifier of sexual development.
The aim of this study was therefore, to elucidate the role of the NR5A1/SF-1 p.Gly146Ala variant on sexual development. We studied 13 DSD patients carrying this variant by next generation sequencing (NGS). Specifically, we searched for other DSD-causing variants and their pathogenicity in order to assess the effect of the NR5A1/SF-1 p.Gly146Ala variant on the phenotype of its carriers.

Patients and ethical approval
We recruited 13 pediatric DSD individuals carrying the NR5A1/SF-1 p.Gly146Ala variant from a larger cohort of 124 DSD patients (98 patients with a 46,XY karyotype, 24 with 46,XX, and 2 patients with a 45X/46,XY karyotype) collected at the Biocruces Bizkaia Health Research Institute (Barakaldo, Spain). Clinical data were provided by the caring clinicians (Table 1 and  S2 Table). The study was approved by the ethics committee for clinical research of Euskadi (CEIC-E), Spain. Written informed consent was provided by the parents of the studied children.
Nine 46,XY DSD and four 46,XX DSD patients carrying the p.Gly146Ala variant in the NR5A1/SF-1 gene were analyzed using whole exome sequencing (WES) or a targeted gene panel comprised of 48 genes associated with sex determination, sex differentiation and hypogonadism (S1 Table).

Genetic analysis
Genomic DNA from patients was extracted from peripheral blood leukocytes using the Mag-Purix 12S system (Zinexts Life Science Corp.). DNA purity and concentration were determined using a Qubit 2.0 fluorometer (Thermo Fisher Scientific). Blood or DNA of family members of any index cases was not available for testing.
Initial characterization for the NR5A1/SF-1 p.Gly146Ala variant was done by a DSD specific panel. Additional NGS was performed by WES ( Fig 1A). For WES, libraries were prepared using the Illumina DNA/RNA Prep Tagmentation PCR reagents (Illumina) and Illumina Exome Panel (CEX) according to the manufacturer's instructions. The resulting libraries were subjected to paired-end sequencing on a NovaSeq 6000 System (Illumina). Raw data were then converted to FastQ files for computational analysis, which included read alignment to the human genome reference sequence (GRCh38), duplicate marking, base quality score recalibration, indel realignment, and variant calling with an in-house bioinformatics pipeline using BWA-MEM [34] and GATK [35] software. Variants were annotated with ANNOVAR [36] and filtration process of the exonic variants was performed using R software (R 4.2.0). Variant filtration was performed following specific steps as given in Fig 1B. These steps included the filtration of WES data by a DSD-and NR5A1-related gene list. These disease-tailored lists were updated from previously reported work (DSD-gene list, N = 584; NR5A1-related gene list, N = 628) [26] (Fig 1B, step a). Then, we kept the resulting variants with all type of predicted consequences (e.g. nonsynonymous, frameshift deletion, stop/gain), but synonymous, and with a number of reads above or equal to 20 (Fig 1B steps b and c) [37]. Next, we filtered to include variants with a MAF�0.01 according to gnomAD (v3.1.2) and depending on the origin and karyotype of the patient (Fig 1B, step d). In step e, we confirmed the correct annotation, location of variants and zygosity by checking their alignment data in IGV (Integrative Genomics Viewer). Finally, we predicted the possible effect of

PLOS ONE
Other gene variants are likely explaining the DSD in patients carrying the NR5A1/SF-1 p.Gly146Ala the identified variant (see below) ( Fig 1B, step f). Variants were confirmed by visual examination using the IGV (Integrative Genomics Viewer) browser [38,39]. For the targeted DSD-gene panel analysis, preparation of the libraries and sequencing have been described elsewhere [27]. For variant filtration after panel analysis, same steps b to f were followed ( Fig 1B).

In silico analyses and variant classification
We predicted the possible effect of identified nonsynonymous genetic variants on the structure and function of the protein using Polyphen-2, (Polymorphism Phenotyping v2), Panther (Protein ANalysis THrough Evolutionary Relationships), SNPs and GO, CADD (Combined Annotation Dependent Depletion) and the calibrated scores given by VarSome [40] for Revel (Rare Exome Variant Ensemble Learner), SIFT (Scale-invariant feature transform), Provean (Protein Variation Effect Analyzer), Mutation taster and M-CAP (Mendelian Clinically Applicable Pathogenicity) (see S3 Table). Variants were classified for pathogenicity according to the standards and guidelines of the ACMG [9] using VarSome. We considered variants as candidates when classified as pathogenic, likely pathogenic or as VUS by the ACMG criteria or when classified as pathogenic or VUS by at least 7 out of 9 prediction programs. Previously reported clinical associations were searched in ClinVar and HGMD databases. In addition, the literature (e.g. PubMed) was searched for evidence of relationship with DSD, sex development and the specific clinical phenotype of each study subject. We explored the possible disease-causing variants' combinations using ORVAL (Oligogenic Resource for Variant AnaLysis) [41].

Results and discussion
In a random cohort of 124 subjects with a DSD, we identified the NR5A1/SF-1 p.Gly146Ala variant in 13 individuals (10.5%). The prevalence in 46,XY DSD subjects was 9.2% (9/98), and was in line with previously reported prevalence in this population [18,19]. Prevalence was higher in 46,XX DSD (4/24, 16.7%). Of the 13 studied subjects, four were homozygous and nine heterozygous for the NR5A1/SF-1 p.Gly146Ala variant. The phenotype of the individuals ranged from typical for karyotype to mild and severe atypical in 46,XY as well as opposite sex in both 46,XY and 46,XX (Fig 2). Patients were of African (7/13), Spanish (4/13) and Asian (2/ 13) origin. A summary of the clinical and biochemical characteristics of the patients is given in Table 1 and S2 Table. An overview of the identified genes of our study subjects that likely play a role for the DSD phenotype in a concerted way is given in NGS performed in DSD individuals harboring the p.Gly146Ala variant in NR5A1/SF-1 revealed several deleterious/candidate variants that potentially explain the phenotype of the patients. Overall, we identified either a known pathogenic DSD variant or one to four potentially deleterious/candidate variants in 9 out of the 13 DSD individuals analyzed. A detailed summary of identified variants in other DSD-related genes is shown in Table 2 (further details in S3 and S4 Tables).
In two patients we detected variants in known DSD-causing genes with our targeted gene panel, e.g. LHCGR and AR. In 11 patients WES was performed and variants were filtered by candidate gene lists (Fig 1). Overall, the NGS analysis identified 63 variants categorized as (likely) pathogenic or VUS in 57 different genes, however further review of evidence of correlation with DSD, sex development and phenotype of each patient with literature reduced this number to 19 potentially deleterious/candidate variants in 17 genes in nine subjects. In eight 46,XY DSD individuals 1-4 variants were found in a total of 15 genes, while one 46,XX DSD person revealed two variants in two different genes ( Table 2). All variants, identified either by gene panel or WES, but one (e.g. LHCGR), were detected in heterozygosis or hemizygosis. Details of the rejected variants are given in S4 Table. In patient 1 two frameshift deletions in genes FGFR3 (c.1633_1634del; p.Cys545Hisfs*17) and INSR (c.660_661del; p.Pro220Hisfs*4) and a stop-gain variant in ADAMTS16 (c.1822_1823del; p.His608*) were found and predicted to be likely pathogenic by the ACMG criteria. FGFR3 is essential for testicular development in humans [44], while INSR and ADAMTS16 are needed in murine genitourinary development and testicular differentiation, respectively [45,46]. Variants in ADAMTS16 have also been reported in heterozygosis in two 46,XY females with complete gonadal dysgenesis and a 46,XY DSD patient with ambiguous genitalia [47]. Testing for a digenic combination network with ORVAL showed no variant interaction between ADAMTS16 and FGFR3.
We detected four heterozygous missense variants in four genes in patient 3. These were GLI2 (c.3528G>T; p.Gln1176His), CDH7 (c.1623C>A; p.His541Gln), MYO7A (c.2882G>A; p.Gly961Asp) and VDR (c.176C>T; p.Thr59Ile). The variant in GLI2 (c.3528G>T; p. Gln1176His) was rated as pathogenic by most of the in silico prediction tools and variants in additional genes were rated as VUS when analyzing according to pathogenicity guidelines. Variants in GLI2 have been described in syndromic DSD patients together with short stature, primary hypogonadism, heart and craniofacial anomalies and 46,XY gonadal dysgenesis [48], as well as in 46,XY non-syndromic DSD manifesting with female external genitalia or with ambiguous genitalia [26, 49,50]. Variants in CHD7 have been previously associated with a broad range of 46,XY DSD phenotypes, including males with hypospadias or micropenis and individuals with female external genitalia [30,51]. MYO7A is overexpressed in male supporting cells during gonadal development [52] and has been shown to be a SRY and SOX9 target gene [53], but, in DSD individuals it has been identified only in combination with MAMLD1 [50,54]. Finally, VDR plays a dominant role in male fertility as Vdr-/-mice show abnormal gonads in both sexes and variable reproductive phenotypes such as reduced sperm count [55]. In humans, two polymorphisms in VDR were associated with female idiopathic infertility only [56]. Fertility of patient 3 has not been assessed yet, and we cannot exclude a role of the VDR variant in his DSD phenotype. Network analysis by ORVAL predicts a pathogenic gene network between CHD7, MYO7A and GLI2 (S1 Fig). A heterozygous missense c.182C>A; p.Pro61Gln variant in Neuropilin 1 (NRP1) gene was found in patient 6. NRP1 interacts with Sema3A which is essential for the development of the GnRH neuron system [57]. Loss of Sema3a (Semaphorin 3A) signaling in mice results in reduced gonadal size and recapitulates the features of Kallmann syndrome [57]. In humans, variants in NRP1 have been identified in a 46,XY DSD subject with female external genitalia

PLOS ONE
Other gene variants are likely explaining the DSD in patients carrying the NR5A1/SF-1 p.Gly146Ala [51] and a 46,XY male presenting with penoscrotal hypospadias, in whom other genetic variants were identified, among them a variant in MAMLD1, a known DSD-related gene [54]. In 46,XY patient 8 with a phenotype of opposite sex a homozygous, inactivating variant in LHCGR (c.757T>C; p.Ser253Pro) was found. This variant has been previously reported to severely reduce the signal transduction activity of the LH receptor and therefore leads to the complete form of Leydig cell hypoplasia (LCH) as suspected in patient 8 [58].
A missense variant in COL27A1 (c.3715C>T; p.Arg1239Trp) and a frameshift insertion in TYRO3 (c.666_667insCACTGCCTGCAGCCCCCTTCAACATCACC; p.Ala223HisfsTer21) were found in patient 9. Both variants were categorized as VUS and were detected in heterozygosis. In mice, Col27a1 is highly expressed in XY somatic supporter cells compared to XX during the earliest stages of gonad development [59]. Col27a1 has been identified as a SRY target gene in the embryonic mouse gonads at E11.5 by ChIP-Chip experiments [53]. Similarly, Tyro3 is overexpressed in male somatic cells [60], and is regulated by SOX9 [53]. Protein truncating variants of TYRO3 were found in individuals with idiopathic hypogonadotropic hypogonadism establishing a role of this gene in reproductive development [61]. Taken together, the data suggest that both COL27A1 and TYRO3 genes might be part of the genetic network underlying the early stages of mammalian fetal gonadal development. However, additional studies are needed to verify that these genetic variants are causing the ovotesticular DSD phenotype in patient 10. Moreover, a gene interaction between COL27A1 and TYRO3 was not predicted by ORVAL.
In 46,XY patient 10 with a distal hypospadias, one missense variant in the SOX8 (c.676A>C; p.Thr226Pro) gene was detected. It was identified in heterozygosis and was classified as VUS. SOX8 is involved in early testis determination [62]. SOX8 gene variants are associated with a range of phenotypes including 46,XY DSD and human reproductive anomalies in males and females [63]. Single-nucleotide variants (SNV) associated with 46,XY gonadal dysgenesis are mostly located in the HMG-box of SOX8 [49], while those reported in infertile patients flank the HMG-box or localize to one of the transactivation domains (TA) [64]. However, more recently, a missense variant in the TA2 region of SOX8 was identified in a 46,XY patient with gonadal dysgenesis [49]. The novel c.676A>C; p.Thr226Pro variant is located in the first TA of the protein. In vitro studies have shown impaired cellular localization in some mutant proteins located in this functional domain of SOX8. Therefore, this missense variant likely explains the genital phenotype observed in patient 10. At age 14 years, biochemical assessment of the HPG axis was normal and pubertal development was ongoing (Tanner 3-4).

PLOS ONE
Other gene variants are likely explaining the DSD in patients carrying the NR5A1/SF-1 p.Gly146Ala Four heterozygous VUS or likely pathogenic variants were identified in patient 11 with a severe 46,XY DSD phenotype. These were POR (c.1679C>T; p.Thr560Met), PKD1 (c.2624C>T; p.Pro875Leu), SRCAP (c.7142G>A; p.Arg2381His) and SOX9 (c.710dup; p. Pro238Thrfs*14). The involvement of POR and SOX9 in sexual development is well known and several sequence variants have been described in 46,XY DSD patients [30,53,65]. The patient showed the missense POR p.Thr560Met variant in compound heterozygosity with the p.Ala500Val (c.1499C>T) polymorphism. A previous report suggested that the combination of a pathogenic POR variant and a polymorphism may cause CAH [66] However, to confirm a disease-causing effect of the POR variants for the DSD phenotype in our patient, functional studies including the specific variants would be needed. Pkd1 is critical for epididymal epithelium development and for maintaining mice male reproductive tract [67]. PKD1 variants have not been related to DSD yet, but they cause autosomal dominant polycystic kidney disease (ADPKD), which involves reproductive tract abnormalities and infertility in males [63]. Therefore, a role of PKD1 variants in DSD seems possible. Likewise, the role of SRCAP in sex differentiation and development is unknown. However, this is the second 46,XY DSD patient in whom a gene variant is identified [48]. According to ORVAL analysis, oligogenic pathogenicity is predicted by combination of variants in a gene network including POR, PKD1 and SRCAP (S1 Fig).
In patient 12, we identified an AR variant (c.2323C>T; p.Arg775Cys) previously reported in a patient with Complete Androgen Insensitivity Syndrome (CAIS) [43]. Because the patient presented with a typical CAIS phenotype, it seems plausible that this hemizygous AR variant is fully responsible for the DSD.
Patient 13, with a severe 46,XY phenotype, harbored two heterozygous missense variants in MYO7A and SOX8 genes. Both were categorized as VUS by the ACMG. As in patient 10, the SOX8 variant (c.694A>C; p.Lys232Gln) was also located in the TA1 domain of the protein. However, the phenotype of patient 13 was more severe, either caused by the SOX8 variant alone or due to the digenic effect together with MYO7A. Importantly, the combination of variants in SOX8 and MYO7A is predicted as disease-causing by ORVAL (S1 Fig). The combination of variants in MYO7A and SOX8 in DSD was reported previously [50,54], and suggests that the broad phenotype observed in DSD individuals might be explained by oligogenic origin [2].
In four patients carrying the heterozygous p.Gly146Ala NR5A1 variant, the WES and specific data analysis revealed no other candidate genes explaining their DSD phenotype. Of these patients 2, 4 and 5 had a 46,XX karyotype and an opposite genital phenotype, and were assigned as males at birth, whereas patient 7 presented with a severe 46,XY DSD. All of them had no other organ anomalies. Although NGS has facilitated the identification of the underlying genetic defects of DSD, about 50% of individuals with a 46,XY DSD remain without a molecular diagnosis with currently used methods [30]. We used WES in our study, while other genetic studies also search for variants in noncoding, regulatory or intronic regions by whole genome sequencing (WGS). But even when using WGS, a considerable number of patients are still reportedly unsolved [68]. Thus, other factors such as environmental factors or epigenetic regulators have been suggested playing a role [68,69]. In addition, oligogenic or even polygenic inheritance might be considered for explaining the broad phenotypes seen in some individuals with a DSD [3,[26][27][28][29][30][31][32]54,[70][71][72][73]. In early days of genetic workup of DSD, patients were studied by candidate Sanger sequencing. In 46,XY DSD subjects typical candidates were the AR, SRD5A2 and NR5A1/SF-1; and once a genetic variant was found, additional genes were not tested. Thus, some DSD patients that have been tested by the candidate approach may not have a correct diagnosis and need to be retested by NGS.
Our study suffers from some limitations. The disease-causing effect of identified variants was assessed with bioinformatics tools and the current knowledge from literature only. Ideally, novel genetic variants should be functionally tested for definite proof of their pathogenic effect. But this is not an easy task when finding multiple candidates, as both cell models as well as animal models have their limitations. We and others try to overcome this obstacle in the near future by using patient-derived fibroblasts comprising the original (complex) genetic background and reprogramming into corresponding gonadal and adrenal cell lines for mechanistic studies [74]. Additionally, trio exome or parental sequencing would help to assess the mode of inheritance and the clinical relevance of variants by looking at genotype-phenotype correlations, but unfortunately DNA of relatives is not always available (as in our case).
In conclusion, NGS genetic analysis of DSD individuals carrying the p.Gly146Ala variant of the NR5A1/SF-1 gene revealed variants in other genes (likely) explaining their phenotype. These gene variants were either found in established DSD genes, were previously described or novel, and were (likely) disease-causing either in a monogenic or in a suggested oligogenic fashion. Although we were not able to find causal genetic variants in four out of 13 DSD individuals carrying the NR5A1/SF-1 p.Gly146Ala, our study supports the claim that this NR5A1/ SF-1 variant is a benign polymorphism that does not play a role in the pathogenesis of DSD. Therefore, we strongly recommend reanalyzing DSD patients whose phenotype has been thought to be explained by this variant in order to find the real underlying genetic cause.

S1 Fig. Potential oligogenic interaction networks of DSD-and NR5A1-related genes identified in specific DSD individuals harbouring the NR5A1/SF-1 p.Gly146Ala variant.
Networks were identified for patients 3, 11 and 13 respectively. To search for potential oligogenic disease networks, the Oligogenic Resource for Variant AnaLysis (ORVAL, https://orval. ibsquare.be/) was used. Nodes represent genes and edges connect two genes only, if between them there is at least one candidate disease-causing variant combination predicted by Var-CoPP. The colour of the edge represents the pathogenicity score for that pair of genes. This score is represented in a colour range from brown (higher pathogenicity score) to yellow (lower pathogenicity score). (DOCX) S1 Table. (7)